Goto

Collaborating Authors

 Mauricie Region


Holonorm

Yongueng, Daryl Noupa, Tembine, Hamidou

arXiv.org Artificial Intelligence

Normalization is a key point in transformer training . In Dynamic Tanh (DyT), the author demonstrated that Tanh can be used as an alternative layer normalization (LN) and confirmed the effectiveness of the idea. But Tanh itself faces orthogonality, linearity and distortion problems. Due to that, his proposition cannot be reliable. So we propose a Holonorm (hn) which has residual connections and nonlinearity. Holonorm is suitable for replacing Tanh in the context of normalization. Although the HoloNorm expression could be similar to the softsign function in dimension one, softsign is a componentwise function which is not good for tensors and vectors of great dimension. Holonorm preserves the orthogonality, the direction, the invertibility of the signal. Holonorm is also a suitable metric, maps all vectors into the open unit ball. This prevents exploding activations and improves stability in deep Transformer models. In this work, we have meticulously examined the normalization in transformers and say that Holonorm, a generalized form of softsign function suited as a normalization function first.Second, defined between 0 and 1 hn serves as a percentage, and $1 - \text{Holonorm}$ is its complement, making it better understandable in evaluating a model.


District Vitality Index Using Machine Learning Methods for Urban Planners

Marcoux, Sylvain, Dessureault, Jean-Sébastien

arXiv.org Artificial Intelligence

City leaders face critical decisions regarding budget allocation and investment priorities. How can they identify which city districts require revitalization? To address this challenge, a Current Vitality Index and a Long-Term Vitality Index are proposed. These indexes are based on a carefully curated set of indicators. Missing data is handled using K-Nearest Neighbors imputation, while Random Forest is employed to identify the most reliable and significant features. Additionally, k-means clustering is utilized to generate meaningful data groupings for enhanced monitoring of Long-Term Vitality. Current vitality is visualized through an interactive map, while Long-Term Vitality is tracked over 15 years with predictions made using Multilayer Perceptron or Linear Regression. The results, approved by urban planners, are already promising and helpful, with the potential for further improvement as more data becomes available. This paper proposes leveraging machine learning methods to optimize urban planning and enhance citizens' quality of life.


Design an Ontology for Cognitive Business Strategy Based on Customer Satisfaction

Bagherzadeh, Neda, Setayeshi, Saeed, Yazdani, Samaneh

arXiv.org Artificial Intelligence

Ontology is a general term used by researchers who want to share information in a specific domain. One of the hallmarks of the greatest success of a powerful manager of an organization is his ability to interpret unplanned and unrelated events. Tools to solve this problem are vital to business growth. Modern technology allows customers to be more informed and influential in their roles as patrons and critics. This can make or break a business. Research shows that businesses that employ a customer-first strategy and prioritize their customers can generate more revenue. Even though there are many different Ontologies offered to businesses, none of it is built from a cognitive perspective. The objective of this study is to address the concept of strategic business plans with a cognitive ontology approach as a basis for a new management tool. This research proposes to design a cognitive ontology model that links customer measurement with traditional business models, define relationships between components and verify the accuracy of the added financial value.


Improving the adaptive and continuous learning capabilities of artificial neural networks: Lessons from multi-neuromodulatory dynamics

Mei, Jie, Rodriguez-Garcia, Alejandro, Takeuchi, Daigo, Wainstein, Gabriel, Hubig, Nina, Mohsenzadeh, Yalda, Ramaswamy, Srikanth

arXiv.org Artificial Intelligence

Continuous, adaptive learning--the ability to adapt to the environment and improve performance--is a hallmark of both natural and artificial intelligence. Biological organisms excel in acquiring, transferring, and retaining knowledge while adapting to dynamic environments, making them a rich source of inspiration for artificial neural networks (ANNs). This study explores how neuromodulation, a fundamental feature of biological learning systems, can help address challenges such as catastrophic forgetting and enhance the robustness of ANNs in continuous learning scenarios. Driven by neuromodulators including dopamine (DA), acetylcholine (ACh), serotonin (5-HT) and noradrenaline (NA), neuromodulatory processes in the brain operate at multiple scales, facilitating dynamic responses to environmental changes through mechanisms ranging from local synaptic plasticity to global network-wide adaptability. Importantly, the relationship between neuromodulators, and their interplay in the modulation of sensory and cognitive processes are more complex than expected, demonstrating a "many-to-one" neuromodulator-to-task mapping. To inspire the design of novel neuromodulation-aware learning rules, we highlight (i) how multineuromodulatory interactions enrich single-neuromodulator-driven learning, (ii) the impact of neuromodulators across multiple spatial and temporal scales, and correspondingly, (iii) strategies for approximating and integrating neuromodulated learning processes in ANNs. To illustrate these principles, we present a case study to demonstrate how neuromodulation-inspired mechanisms, such as DA-driven reward processing and NA-based cognitive flexibility, can enhance ANN performance in a Go/No-Go task. By integrating multi-scale neuromodulation, we aim to bridge the gap between biological learning and artificial systems, paving the way for ANNs with greater flexibility, robustness, and adaptability.


Integrating Fuzzy Logic with Causal Inference: Enhancing the Pearl and Neyman-Rubin Methodologies

Saki, Amir, Faghihi, Usef

arXiv.org Artificial Intelligence

In this paper, we generalize the Pearl and Neyman-Rubin methodologies in causal inference by introducing a generalized approach that incorporates fuzzy logic. Indeed, we introduce a fuzzy causal inference approach that consider both the vagueness and imprecision inherent in data, as well as the subjective human perspective characterized by fuzzy terms such as 'high', 'medium', and 'low'. To do so, we introduce two fuzzy causal effect formulas: the Fuzzy Average Treatment Effect (FATE) and the Generalized Fuzzy Average Treatment Effect (GFATE), together with their normalized versions: NFATE and NGFATE. When dealing with a binary treatment variable, our fuzzy causal effect formulas coincide with classical Average Treatment Effect (ATE) formula, that is a well-established and popular metric in causal inference. In FATE, all values of the treatment variable are considered equally important. In contrast, GFATE takes into account the rarity and frequency of these values. We show that for linear Structural Equation Models (SEMs), the normalized versions of our formulas, NFATE and NGFATE, are equivalent to ATE. Further, we provide identifiability criteria for these formulas and show their stability with respect to minor variations in the fuzzy subsets and the probability distributions involved. This ensures the robustness of our approach in handling small perturbations in the data. Finally, we provide several experimental examples to empirically validate and demonstrate the practical application of our proposed fuzzy causal inference methods.


When AI Eats Itself: On the Caveats of Data Pollution in the Era of Generative AI

Xing, Xiaodan, Shi, Fadong, Huang, Jiahao, Wu, Yinzhe, Nan, Yang, Zhang, Sheng, Fang, Yingying, Roberts, Mike, Schönlieb, Carola-Bibiane, Del Ser, Javier, Yang, Guang

arXiv.org Artificial Intelligence

Generative artificial intelligence (AI) technologies and large models are producing realistic outputs across various domains, such as images, text, speech, and music. Creating these advanced generative models requires significant resources, particularly large and high-quality datasets. To minimize training expenses, many algorithm developers use data created by the models themselves as a cost-effective training solution. However, not all synthetic data effectively improve model performance, necessitating a strategic balance in the use of real versus synthetic data to optimize outcomes. Currently, the previously well-controlled integration of real and synthetic data is becoming uncontrollable. The widespread and unregulated dissemination of synthetic data online leads to the contamination of datasets traditionally compiled through web scraping, now mixed with unlabeled synthetic data. This trend portends a future where generative AI systems may increasingly rely blindly on consuming self-generated data, raising concerns about model performance and ethical issues. What will happen if generative AI continuously consumes itself without discernment? What measures can we take to mitigate the potential adverse effects? There is a significant gap in the scientific literature regarding the impact of synthetic data use in generative AI, particularly in terms of the fusion of multimodal information. To address this research gap, this review investigates the consequences of integrating synthetic data blindly on training generative AI on both image and text modalities and explores strategies to mitigate these effects. The goal is to offer a comprehensive view of synthetic data's role, advocating for a balanced approach to its use and exploring practices that promote the sustainable development of generative AI technologies in the era of large models.


Probabilistic Easy Variational Causal Effect

Faghihi, Usef, Saki, Amir

arXiv.org Machine Learning

Let $X$ and $Z$ be random vectors, and $Y=g(X,Z)$. In this paper, on the one hand, for the case that $X$ and $Z$ are continuous, by using the ideas from the total variation and the flux of $g$, we develop a point of view in causal inference capable of dealing with a broad domain of causal problems. Indeed, we focus on a function, called Probabilistic Easy Variational Causal Effect (PEACE), which can measure the direct causal effect of $X$ on $Y$ with respect to continuously and interventionally changing the values of $X$ while keeping the value of $Z$ constant. PEACE is a function of $d\ge 0$, which is a degree managing the strengths of probability density values $f(x|z)$. On the other hand, we generalize the above idea for the discrete case and show its compatibility with the continuous case. Further, we investigate some properties of PEACE using measure theoretical concepts. Furthermore, we provide some identifiability criteria and several examples showing the generic capability of PEACE. We note that PEACE can deal with the causal problems for which micro-level or just macro-level changes in the value of the input variables are important. Finally, PEACE is stable under small changes in $\partial g_{in}/\partial x$ and the joint distribution of $X$ and $Z$, where $g_{in}$ is obtained from $g$ by removing all functional relationships defining $X$ and $Z$.

  Country:
  Genre: Research Report (1.00)
  Industry: Health & Medicine (0.67)

A Comprehensive Study of Vision Transformers in Image Classification Tasks

Khalil, Mahmoud, Khalil, Ahmad, Ngom, Alioune

arXiv.org Artificial Intelligence

Image Classification is a fundamental task in the field of computer vision that frequently serves as a benchmark for gauging advancements in Computer Vision. Over the past few years, significant progress has been made in image classification due to the emergence of deep learning. However, challenges still exist, such as modeling fine-grained visual information, high computation costs, the parallelism of the model, and inconsistent evaluation protocols across datasets. In this paper, we conduct a comprehensive survey of existing papers on Vision Transformers for image classification. We first introduce the popular image classification datasets that influenced the design of models. Then, we present Vision Transformers models in chronological order, starting with early attempts at adapting attention mechanism to vision tasks followed by the adoption of vision transformers, as they have demonstrated success in capturing intricate patterns and long-range dependencies within images. Finally, we discuss open problems and shed light on opportunities for image classification to facilitate new research ideas.


Surface EMG-Based Inter-Session/Inter-Subject Gesture Recognition by Leveraging Lightweight All-ConvNet and Transfer Learning

Islam, Md. Rabiul, Massicotte, Daniel, Massicotte, Philippe Y., Zhu, Wei-Ping

arXiv.org Artificial Intelligence

Gesture recognition using low-resolution instantaneous HD-sEMG images opens up new avenues for the development of more fluid and natural muscle-computer interfaces. However, the data variability between inter-session and inter-subject scenarios presents a great challenge. The existing approaches employed very large and complex deep ConvNet or 2SRNN-based domain adaptation methods to approximate the distribution shift caused by these inter-session and inter-subject data variability. Hence, these methods also require learning over millions of training parameters and a large pre-trained and target domain dataset in both the pre-training and adaptation stages. As a result, it makes high-end resource-bounded and computationally very expensive for deployment in real-time applications. To overcome this problem, we propose a lightweight All-ConvNet+TL model that leverages lightweight All-ConvNet and transfer learning (TL) for the enhancement of inter-session and inter-subject gesture recognition performance. The All-ConvNet+TL model consists solely of convolutional layers, a simple yet efficient framework for learning invariant and discriminative representations to address the distribution shifts caused by inter-session and inter-subject data variability. Experiments on four datasets demonstrate that our proposed methods outperform the most complex existing approaches by a large margin and achieve state-of-the-art results on inter-session and inter-subject scenarios and perform on par or competitively on intra-session gesture recognition. These performance gaps increase even more when a tiny amount (e.g., a single trial) of data is available on the target domain for adaptation. These outstanding experimental results provide evidence that the current state-of-the-art models may be overparameterized for sEMG-based inter-session and inter-subject gesture recognition tasks.


Probabilistic Variational Causal Effect as A new Theory for Causal Reasoning

Faghihi, Usef, Saki, Amir

arXiv.org Artificial Intelligence

In this paper, we introduce a new causal framework capable of dealing with probabilistic and non-probabilistic problems. Indeed, we provide a direct causal effect formula called Probabilistic vAriational Causal Effect (PACE) and its variations satisfying some ideas and postulates. Our formula of causal effect uses the idea of the total variation of a function integrated with probability theory. The probabilistic part is the natural availability of changing an exposure values given some variables. These variables interfere with the effect of the exposure on a given outcome. PACE has a parameter $d$ determining the degree of considering the natural availability of changing the exposure values. The lower values of $d$ refer to the scenarios for which rare cases are important. In contrast, with the higher values of $d$, our framework deals with the problems that are in nature probabilistic. Hence, instead of a single value for causal effect, we provide a causal effect vector by discretizing $d$. Further, we introduce the positive and negative PACE to measure the positive and the negative causal changes in the outcome while changing the exposure values. Furthermore, we provide an identifiability criterion for PACE to deal with observational studies. We also address the problem of computing counterfactuals in causal reasoning. We compare our framework to the Pearl, the mutual information, the conditional mutual information, and the Janzing et al. frameworks by investigating several examples.